PRIVATEER: A Private Record Linkage Toolkit

Authors

  • Alexandros Karakasidis
  • Georgia Koloniari
  • Vassilios S. Verykios
Abstract

Privacy-preserving record linkage (PPRL) is the process of integrating data across multiple heterogeneous data sources without compromising their privacy. While many techniques have been developed for PPRL, no universally accepted method has yet emerged that provides both performance and quality of results in all cases. To this end, we present PRIVATEER, a toolkit that aims to enable practitioners to compare the various techniques involved in the PPRL process and determine the best one for their needs. The toolkit is based on a simulator designed to be highly configurable, modular and extensible, allowing users to test different configurations by combining a number of privacy-preserving blocking and matching methods with corresponding distance and similarity measures, on their own data or on sample data. We showcase the usability of our toolkit by presenting experimental results measuring both the quality and the performance of state-of-the-art PPRL methods.

Similar resources

Towards an Open Source Toolkit for Building Record Linkage Workflows

Record linkage has been a subject of research for several decades, and a huge number of record linkage solutions have been proposed, based on probabilistic and empirical paradigms. However, record linkage is a complex process, for the execution of which one single technique is often not enough; it can be seen as composed of distinct phases, each requiring a specific technique and depending on giv...

Scaling Private Record Linkage using Output Constrained Differential Privacy

Many scenarios require computing the join of databases held by two or more parties that do not trust one another. Private record linkage is a cryptographic tool that allows such a join to be computed without leaking any information about records that do not participate in the join output. However, such strong security comes with a cost: except for exact equi-joins, these techniques have a high ...

Private record linkage with Bloom filters

In many record linkage applications, identifiers have to be encrypted to preserve privacy. Therefore, a method for approximate string comparison in private record linkage is needed. We describe a new method of approximate string comparison in private record linkage. The main idea is to store q-grams sets derived from identifier values in Bloom filters and compare them bitwise across databases. ...

Sorted Nearest Neighborhood Clustering for Efficient Private Blocking

Record linkage is an emerging research area which is required by various real-world applications to identify which records in different data sources refer to the same real-world entities. Often privacy concerns and restrictions prevent the use of traditional record linkage applications across different organizations. Linking records in situations where no private or confidential information can...

An Empirical Comparison of Approaches to Approximate String Matching in Private Record Linkage

Due to the frequency of spelling and typographical errors in practical applications, record linkage algorithms have to use string similarity functions. In many legal contexts, identifiers such as names have to be encrypted before a record linkage can be attempted. Therefore, algorithms for computing string similarity functions with encrypted identifiers are essential for approximating string ma...
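
As a concrete instance of approximate matching over encoded identifiers, the Bloom filter idea summarized above (store the q-gram set of an identifier in a Bloom filter, then compare filters bitwise) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation from any of the listed papers or from PRIVATEER: the function names, the filter size, the number of hash functions, the padding character, the use of keyed HMAC double hashing, and the choice of the Dice coefficient as the bitwise similarity are all assumptions made for the example.

```python
import hashlib
import hmac


def qgrams(value, q=2):
    """Return the set of q-grams of a padded, lower-cased string."""
    pad = "_" * (q - 1)  # padding character is an assumption
    padded = pad + value.lower() + pad
    return {padded[i:i + q] for i in range(len(padded) - q + 1)}


def bloom_encode(value, secret, size=1000, num_hashes=20, q=2):
    """Map each q-gram to num_hashes bit positions in a filter of `size` bits.

    The filter is represented as the set of positions that are set to 1.
    Positions are derived by double hashing with two keyed HMACs; this
    construction and the default parameters are assumptions of the sketch.
    """
    bits = set()
    for gram in qgrams(value, q):
        data = gram.encode("utf-8")
        h1 = int.from_bytes(hmac.new(secret, data, hashlib.sha256).digest(), "big")
        h2 = int.from_bytes(hmac.new(secret, data, hashlib.sha1).digest(), "big")
        for i in range(num_hashes):
            bits.add((h1 + i * h2) % size)
    return bits


def dice_similarity(bits_a, bits_b):
    """Dice coefficient over the set bit positions of two filters."""
    if not bits_a and not bits_b:
        return 1.0
    return 2 * len(bits_a & bits_b) / (len(bits_a) + len(bits_b))


if __name__ == "__main__":
    key = b"shared-secret"  # hypothetical key agreed on by the data holders
    a = bloom_encode("smith", key)
    b = bloom_encode("smyth", key)
    c = bloom_encode("jones", key)
    print("smith vs smyth:", dice_similarity(a, b))
    print("smith vs jones:", dice_similarity(a, c))
```

Both parties must use the same key and parameters so that identical q-grams set identical bit positions; names that share most of their q-grams then yield filters with high bitwise overlap, while unrelated names overlap only through chance collisions.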

Publication date: 2015